Business Analytics

Advanced Data Visualizations

Ayush Patel and Jayati Sharma

25 February, 2024

Prerequisites

You already…

  • Know basic and advanced data wrangling functions in R
  • Know basics of data visualization in R
  • Can write functions in R

Before we begin

Please install and load the following packages

library(dplyr)
library(tidyverse)
library(scales)
library(patchwork)
library(ggiraph)
library(gghighlight)



Access the lecture slides from the course landing page

About me

I am Ayush.

I am a researcher working at the intersection of data, law, development and economics.

I teach Data Science using R at Gokhale Institute of Politics and Economics.

I am an RStudio (Posit) certified tidyverse Instructor.

I am a Researcher at Oxford Poverty and Human Development Initiative (OPHI), at the University of Oxford.

Reach me

ayush.ap58@gmail.com

ayush.patel@gipe.ac.in

Learning Objectives

  • Learn annotation for graphs in R
  • Learn how to combine graphs
  • Learn scaling functions in R
  • Learn how to make ggplot graphs interactive

Let’s Recap

  • In the data visualization lecture, you learnt how to create various types of graphs using ggplot2
  • Some of them include bar graphs, line graphs, scatter plots, etc.
  • For effective data visualization and communication, any plot requires modifications
  • These include annotations on the plot, modification of axes and scales, highlighting and interactivity of the plot
  • The aim of this lecture is to move beyond making graphs, towards clear and effective visualizations

Annotations in ggplot - Text

Content for this topic has been sourced from ggplot2. Please check out the work for detailed information.

  • In addition to plotting your graph, you want to provide additional details to explain your graph
  • Text annotations are useful in this case
  • The annotate() function can add any kind of geometric object to a plot
  • In the annotate() function, the type of geom is specified first
  • Then, the positioning is required (x and y coordinates in this case)
  • This is followed by the label
ggplot(mtcars, aes(x = wt, y = mpg)) +
   geom_point()+
   annotate("text", x = 4, y = 25, label = "Annotation Text")
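
The coordinates of an annotation do not have to be typed by hand. As a minimal sketch, the example below looks up the car with the highest mpg in mtcars and places its name next to the point. The object names best and p_text are illustrative and not part of the original slides.

```r
library(ggplot2)

# Find the most fuel-efficient car in mtcars (highest mpg)
best <- mtcars[which.max(mtcars$mpg), ]

p_text <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # Place the label slightly to the right of the point;
  # hjust = 0 left-aligns the text at the given x position
  annotate("text",
           x = best$wt + 0.1, y = best$mpg,
           label = rownames(best), hjust = 0)

p_text
```

Taking the position from the data keeps the label attached to the right point even if the dataset changes.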

Annotations in ggplot

  • Further, annotations can be customized
# colour and size change the appearance of the label
ggplot(mtcars, aes(x = wt, y = mpg)) +
   geom_point() +
   annotate("text", x = 4, y = 25, label = "Annotation Text", colour = "orange", size = 8)

# A vector of x values repeats the same label at each position
ggplot(mtcars, aes(x = wt, y = mpg)) +
   geom_point() +
   annotate("text", x = 1:5, y = 6, label = "Annotation Text", colour = "orange", size = 3)

Annotations in ggplot

  • Similar to text annotation, other geoms can be used for annotations
  • However, instead of x and y, xmin, xmax, ymin and ymax are used for the coordinates of the rectangle
  • Do you remember what alpha is used for?
ggplot(mtcars, aes(x = wt, y = mpg)) +
   geom_point()+
   annotate("rect", xmin = 4.8, 
            xmax = 5.7,
            ymin = 10,
            ymax = 18.6, 
            alpha = .2)
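
Annotations can be layered. The sketch below, offered as one possible variation, gives the rectangle a fill colour and adds a text annotation that names the highlighted region. The fill colour, the label text and the name p_rect are assumptions chosen for illustration.

```r
library(ggplot2)

p_rect <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # alpha = 0.2 makes the rectangle mostly transparent,
  # so the points underneath remain visible
  annotate("rect", xmin = 4.8, xmax = 5.7, ymin = 10, ymax = 18.6,
           fill = "steelblue", alpha = 0.2) +
  # A second annotation placed just above the rectangle labels the region
  annotate("text", x = 5.25, y = 19.5, label = "Heaviest cars")

p_rect
```

The rectangle covers the three heaviest cars in mtcars, which is why the label reads "Heaviest cars".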

Annotations in ggplot

  • Suppose you want to add a line segment to your graph
  • Here, annotate() requires the start (x, y) and end (xend, yend) coordinates of the segment
ggplot(mtcars, aes(x = wt, y = mpg)) +
   geom_point()+
   annotate("segment", x = 4.8,
            xend = 5.7,
            y = 10,
            yend = 18.6,
            colour = "red")
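
A segment becomes more useful when it points at something. As a hedged sketch, the example below adds an arrowhead with arrow() and a text label at the start of the segment. The coordinates were picked by eye to point at the Chrysler Imperial, and the name p_arrow is illustrative.

```r
library(ggplot2)

p_arrow <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # arrow() draws an arrowhead at the (xend, yend) end of the segment
  annotate("segment", x = 4.5, xend = 5.3, y = 25, yend = 15.2,
           colour = "red", arrow = arrow(length = unit(0.3, "cm"))) +
  # The label sits just above the start of the segment
  annotate("text", x = 4.5, y = 26, label = "Chrysler Imperial")

p_arrow
```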

Do It Yourself -1

Scales Functions in ggplot2 - Why?

  • When you create a graph using ggplot2, the axes are scaled automatically based on the data
  • However, you will often need to change the axes in order to present the data effectively
  • The scale functions in ggplot2:
    • control how the data is plotted
    • allow manipulation of axes
    • improve the overall appearance of the plot for effective data communication

Scales Functions in ggplot2

  • Look at the scatter plot of wt and mpg
ggplot(mtcars, aes(x = wt, y = mpg)) +
   geom_point()

  • What if you want both the axes to start from 0?
  • scale_y_continuous() lets you set the range of the y-axis; scale_x_continuous() does the same for the x-axis
  • The limits argument takes a vector of two values, the lower and the upper limit of the scale
  • Here, NA tells ggplot2 to keep the default upper limit computed from the data
ggplot(mtcars, aes(x = wt, y = mpg)) +
   geom_point()+
  scale_y_continuous(limits = c(0, NA))
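
The example above changes only the y-axis. To start both axes from 0, the same limits can be passed to scale_x_continuous() as well. This is a minimal sketch, and the name p_zero is illustrative.

```r
library(ggplot2)

p_zero <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # Lower limit fixed at 0; NA keeps the upper limit computed from the data
  scale_x_continuous(limits = c(0, NA)) +
  scale_y_continuous(limits = c(0, NA))

p_zero
```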

Scales Functions in ggplot2

  • Instead of using NA, you can provide an explicit upper limit, such as 40, for the y-axis
ggplot(mtcars, aes(x = wt, y = mpg)) +
   geom_point()+
  scale_y_continuous(limits = c(0, 40))
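
One point worth keeping in mind is that limits set on a scale remove any data that fall outside them, whereas coord_cartesian() only zooms the view. The sketch below contrasts the two with an upper limit of 30, a value chosen for illustration because four cars in mtcars have an mpg above it. The names base, p_scale and p_zoom are illustrative.

```r
library(ggplot2)

base <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Scale limits: the cars with mpg above 30 are dropped,
# and ggplot2 warns about removed rows when the plot is drawn
p_scale <- base + scale_y_continuous(limits = c(0, 30))

# Coordinate limits: the view is restricted to 0-30,
# but no data are removed before plotting
p_zoom <- base + coord_cartesian(ylim = c(0, 30))

# Number of cars affected by the upper limit of 30
sum(mtcars$mpg > 30)
```

The difference matters most when a statistic such as a smoothing line or a boxplot is computed from the data, because scale limits change the data that statistic sees.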

Scales Functions in ggplot2 - Adding breaks

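  • Setting breaks in scale_y_continuous() controls where the tick marks and gridlines appear on the axis
  • Here, seq(0, 40, 7) places a break at every 7 units starting from 0; breaks outside the plotted range are not shown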
ggplot(mtcars, aes(x = wt, y = mpg)) +
   geom_point()+
  scale_y_continuous(breaks = seq(0, 40, 7))
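
Breaks and limits are often set together so that every break falls inside the plotted range. As a small sketch, the example below uses round intervals of 10 and limits that make room for all of them. The names y_breaks and p_breaks are illustrative.

```r
library(ggplot2)

# Breaks at every 10 units
y_breaks <- seq(0, 40, by = 10)  # 0 10 20 30 40

p_breaks <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # The limits cover the full range of the breaks, so all of them are drawn
  scale_y_continuous(limits = c(0, 40), breaks = y_breaks)

p_breaks
```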

Scales Functions in ggplot2

Content for this topic has been sourced from ggplot2. Please check out the work for detailed information.

  • The scale_colour_brewer() function is useful for colouring discrete values on your graph
  • The brewer scales provide sequential, diverging and qualitative colour schemes from ColorBrewer
  • Look at the two charts
  • scale_colour_brewer() helps in efficient mapping of discrete variables to colours
ggplot(mpg, aes(x = displ, y = cty)) +
   geom_point(aes(colour = class))

ggplot(mpg, aes(x = displ, y = cty)) +
   geom_point(aes(colour = class))+
  scale_colour_brewer()
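
Called without arguments, scale_colour_brewer() uses a sequential palette, which suggests an ordering that the vehicle classes do not have. A qualitative palette can be requested through the palette argument. The sketch below uses "Set2", one of several qualitative ColorBrewer palettes; the choice of palette and the name p_brewer are illustrative.

```r
library(ggplot2)

p_brewer <- ggplot(mpg, aes(x = displ, y = cty)) +
  geom_point(aes(colour = class)) +
  # "Set2" is a qualitative palette with up to 8 colours,
  # enough for the 7 vehicle classes in mpg
  scale_colour_brewer(palette = "Set2")

p_brewer
```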

Do It Yourself -2

Scales Package

  • The scales package provides many scaling functions for visualizations
  • It allows for sophisticated customisation of how data values are displayed in a plot
  • It includes functions for readable and informative axis labels
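
As a brief illustration of such functions, the sketch below uses label_comma() and label_percent() from the scales package. Each returns a function that formats numbers, which can be passed to the labels argument of a ggplot2 scale. The economics dataset, which ships with ggplot2, and the name p_labels are choices made for this example.

```r
library(ggplot2)
library(scales)

# Label functions return a formatter that turns numbers into text
label_comma()(1234567)   # "1,234,567"
label_percent()(0.25)    # "25%"

p_labels <- ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line() +
  # Axis labels are shown with a thousands separator, e.g. 8,000
  scale_y_continuous(labels = label_comma())

p_labels
```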